Summary
OpenAI's o3 and o4-mini AI models hallucinate more frequently than the company's previous models, raising concerns about their reliability. OpenAI does not yet know why this is happening and has acknowledged that more research is needed to understand and address the issue.
Key Points
o3 and o4-mini AI models have been found to hallucinate in response to 33% and 48% of questions on PersonQA, respectively
Transluce, a nonprofit AI research lab, has also found evidence that o3 tends to make up actions it took in the process of arriving at answers
OpenAI's GPT-4o with web search achieves 90% accuracy on SimpleQA
Why It Matters
Hallucinations undermine the accuracy and reliability of AI-driven systems, making it crucial to address them across all models.
Author
Maxwell Zeff